CS224N Final Project: Movie Title Recognition in E-Mails
Abstract
For this project, a system was designed to first identify whether or not an email mentions a movie and, if it does, to extract the title, time, and date of the movie in question. The classifier used to determine whether an email is movie-related or not is an extension of the Naive Bayes classifier. The classifier was fairly successful in terms of precision, although it tended to yield a lot of false positives. A named entity recognition (NER) system using a maximum entropy Markov model (MEMM) was used to tag movie titles, locations, addresses, dates, and times. The NER system saw much less success than the classifier, although some labels, such as 'time', did fairly well. The dataset used to train both the classifier and the NER system turned out to be fairly small, so the system could see some improvement with more training data.
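As a rough illustration of the first stage, the sketch below shows a plain multinomial Naive Bayes classifier with add-one smoothing that labels an email as "movie" or "non_movie". It is not the extended classifier the project describes: the whitespace tokenizer, the function names (`tokenize`, `train`, `classify`), the label strings, and the four toy emails are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(examples):
    """Collect per-label word counts from (text, label) pairs."""
    word_counts = {}          # label -> Counter of token frequencies
    doc_counts = Counter()    # label -> number of training emails
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # Log prior for the label.
        score = math.log(doc_counts[label] / total_docs)
        label_total = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed log likelihood.
            numerator = word_counts[label][token] + 1
            score += math.log(numerator / (label_total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
TOY_EMAILS = [
    ("the movie starts at 7pm tonight", "movie"),
    ("screening of the film on friday", "movie"),
    ("meeting agenda for the budget review", "non_movie"),
    ("please submit your homework by friday", "non_movie"),
]

model = train(TOY_EMAILS)
print(classify("movie screening tonight", model))
print(classify("budget meeting agenda", model))
```

With so little training data, the decision rests on a handful of word counts, which mirrors the abstract's remark that a small dataset limits performance.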
Publication date: 2009